Facilitated diffusion of DNA-binding proteins: Simulation of large systems 
The recently introduced method of excess collisions (MEC) is modified to estimate
diffusion-controlled reaction times inside systems of arbitrary size. The resulting MEC-E
equations contain a set of empirical parameters, which have to be calibrated in numerical
simulations inside a test system of moderate size. Once this is done, reaction times of
systems of arbitrary dimensions are derived by extrapolation, with an accuracy of 10 to 15
percent. The achieved speed up, when compared to explicit simulations of the reaction
process, increases in proportion to the extrapolated volume of the cell.

PACS numbers: 87.16.Ac 



1. INTRODUCTION 

Diffusion-controlled biochemical reactions play a central role in keeping any organism
alive [1, 2]: the transport of molecules through cell membranes, the passage of ions across
the synaptic gap, or the search carried out by drugs on the way to their protein receptors
are predominantly diffusive processes. Furthermore, essentially all of the biological
functions of DNA are performed by proteins that interact with specific DNA sequences [3, 4],
and these reactions are diffusion-controlled.

However, it has been realized that some proteins are able to find their specific binding
sites on DNA much more rapidly than is 'allowed' by the diffusion limit. It is therefore
generally accepted that some kind of facilitated diffusion must take place in these cases.
Several mechanisms, differing in details, have been proposed. All of them essentially
involve two steps: the binding to a random non-specific DNA site and the diffusion (sliding)
along the DNA chain. These two steps may be reiterated many times before proteins actually
find their target, since the sliding is occasionally interrupted by dissociation. Berg [5]
and Zhou [6] have provided thorough (but somewhat sophisticated) theories that allow
estimates for the resulting reaction rates. Recently, Halford and Marko have presented a
comprehensive review on this subject and proposed a remarkably simple and semi-quantitative
approach that explicitly contains the mean sliding length as a parameter of the theory [7].
This approach has been refined and put on a rigorous basis in a recent work by the
authors [8]. A plethora of scaling regimes have been studied for a large range of chain
densities and protein-chain affinities in a recent work by Hu et al. [9].

The numerical treatment of such a reaction is efficiently done with the method of excess
collisions [10] (MEC), where the reverse process (the protein departs from the binding site
and propagates toward the periphery of the cell) is simulated. This approach delivers exact
results and a significant speed up when compared to straightforward simulations.
Unfortunately, once very large systems are under investigation, the numerical treatment of
the DNA chain (whose length is proportional to the volume of the cell) quickly turns into a
bottleneck, since the MEC approach requires the construction of the cell in its full
extent. Realistic cell models have to deal with thermal fluctuations of the chain and its
hydrodynamic interaction, thereby imposing a strict limit on the size that can be managed.
In the present work we demonstrate how to implement a modification of the MEC approach that
allows one to simulate a test system of reasonable size, followed by an extrapolation to
cells of arbitrary size.

After a definition of the problem in Sect. 2.1, the MEC approach is briefly summarized in
Sect. 2.2. In Sect. 2.3 the numerical implementation of facilitated diffusion is presented,
and Sect. 2.4 delivers an analytical estimate for the reaction time. As a preparation for
the random walk simulations, the chain is constructed in Sect. 3 and the specific
recurrence times are evaluated inside a small test system (Sect. 4). In Sect. 5, random
walk simulations are carried out in order to construct the empirical MEC-E equations. These
are then employed to extrapolate the reaction times to cells of much larger dimensions in
Sect. 6. A comparison with exact solutions (in the case of free diffusion) and the
analytical estimate of Sect. 2.4 suggests that the MEC-E approach delivers an accuracy of
10 to 15 percent with a speed up of several orders of magnitude.



2. METHODOLOGY 

2.1. Definition of the system 

As a cell we define a spherical volume of radius $R$, containing a chain ('DNA') of length
$L$ and a specific binding target of radius $R_a$. The target is located in the middle of
the chain, which in turn coincides with the center of the cell. The state of the system is
well defined by the position of a random walker ('protein'), which can either diffuse
freely inside the cell or, temporarily, associate with the chain to propagate along the
chain's contour (the numerical realization of this process is discussed in detail in
Sect. 2.3). The distance of the walker from the center defines the (radial) reaction
coordinate $r$. We shall further denote the periphery of the central target (at $r = R_a$)
as state A and the periphery of the cell ($r = R$) as state B. To be investigated is the
average reaction time $\tau_{BA}$ the walker needs to propagate from B to A as a function
of the binding affinity between walker and chain.



2.2. Method of excess collisions (MEC) 



The MEC approach was presented in its full generality elsewhere [10, 11]. In short, it
allows one to determine the reaction time $\tau_{BA}$ while simulating the back reaction
$A \to B$ (average reaction time: $\tau_{AB}$) using the relation

$$\tau_{BA} = (N_{\mathrm{coll}} + 1) \cdot \tau_R - \tau_{AB} . \tag{1}$$

The walker starts at the center ($r(t = 0) = 0$) and propagates towards the periphery
($r(t = \tau_{AB}) = R$), a process that is much faster than its reversal
($\tau_{AB} \ll \tau_{BA}$). On its way to B, the walker may repeatedly return to A; such an
event is called a collision, and $N_{\mathrm{coll}}$ stands for the average number of
collisions. $\tau_R$ is the recurrence time, evaluated via

$$\tau_R = \tilde{\tau}_R \, V_{\mathrm{eff}}(R) , \tag{2}$$

where we have defined the specific recurrence time

$$\tilde{\tau}_R = \frac{\tau_R^{*}}{V_{\mathrm{eff}}(R_a)} , \tag{3}$$

a quantity which is derived from simulations of the recurrence time $\tau_R^{*}$ within a
small test system of the size of the central target (Sect. 4). The effective volume is
defined as

$$V_{\mathrm{eff}}(R) = \int_{V(R)} \exp\left[ -\frac{U(\mathbf{r})}{k_B T} \right] d^3 r \tag{4}$$

and depends upon the energy of the walker $U(\mathbf{r})$ and hence the implementation of
the binding potential between walker and chain.
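As a minimal sketch of how Eqs. (1) to (3) combine, the following Python functions evaluate the reaction time from quantities that would be measured in a simulation. The function and argument names are illustrative assumptions (they do not appear in the original work), and the numbers in the usage lines are made up purely to show the arithmetic.

```python
def specific_recurrence_time(tau_r_test, v_eff_target):
    """Specific recurrence time, Eq. (3): the recurrence time measured in the
    small test system divided by the effective volume of the central target."""
    return tau_r_test / v_eff_target


def mec_reaction_time(n_coll, tau_specific, v_eff_cell, tau_ab):
    """Reaction time tau_BA from the MEC relation, Eq. (1).

    n_coll       -- average number of collisions during the run A -> B
    tau_specific -- specific recurrence time, Eq. (3)
    v_eff_cell   -- effective volume of the full cell
    tau_ab       -- average time of the (fast) back reaction A -> B
    """
    tau_r = tau_specific * v_eff_cell          # recurrence time, Eq. (2)
    return (n_coll + 1.0) * tau_r - tau_ab     # Eq. (1)


# Illustrative (made-up) numbers, chosen only so the arithmetic is easy to follow.
tau_spec_demo = specific_recurrence_time(tau_r_test=8.0, v_eff_target=4.0)
tau_ba_demo = mec_reaction_time(n_coll=4.0, tau_specific=tau_spec_demo,
                                v_eff_cell=100.0, tau_ab=50.0)
```

With these inputs the recurrence time is 2.0 x 100 = 200 steps, so that (4 + 1) x 200 - 50 = 950 steps.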



2.3. Simple model for facilitated diffusion of 
DNA-binding proteins 

The nonspecific binding of the walker to the chain is accounted for by the attractive step
potential

$$U(s) = \begin{cases} -E , & s \le r_c , \\ 0 , & s > r_c , \end{cases} \tag{5}$$

where $s$ is the shortest distance between walker and chain. This defines a pipe of radius
$r_c$ around the chain contour that the walker is allowed to enter freely from outside, but
to exit only with the probability

$$p = \exp(-E / k_B T) , \tag{6}$$



where $k_B T$ is the thermal energy; otherwise it is reflected back inside the chain. We
may therefore denote $p$ as the exit probability. This quantity allows us to define the
equilibrium constant $K$ of the two phases, the free and the non-specifically bound
protein, according to

$$K \equiv \frac{\sigma}{c} = \frac{V_c}{L} \left( \frac{1}{p} - 1 \right) , \tag{7}$$

where $c$ is the concentration of free proteins and $\sigma$ the linear density of
non-specifically bound proteins. $V_c = \pi r_c^2 L$ is the geometric volume of the chain.
It should be noted that in our previous publication [10], $\sigma$ was defined as
$\sigma = c V_c / (p L)$, with the disadvantage of being non-zero in case of vanishing
protein-chain interaction ($p = 1$). The present choice defines $\sigma$ as the excess
concentration of proteins along the chain contour and leads to a vanishing sliding length
(Eq. (14)) in case of free diffusion.

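The specific binding site is a spherical volume, located in the middle of the chain and of
identical radius, i.e. $R_a = r_c$. Applying the walker-chain potential Eq. (5), the
effective volume Eq. (4) of the cell becomes

$$V_{\mathrm{eff}}(R) = V + V_c \left( \frac{1}{p} - 1 \right) , \tag{8}$$

where $V = 4 \pi R^3 / 3$ is the geometric volume of the cell, and that of the central
target is simply

$$V_{\mathrm{eff}}(R_a) = \frac{V_a}{p} = \frac{4 \pi R_a^3}{3 p} . \tag{9}$$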
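The two effective volumes of this section can be evaluated directly. The sketch below is a hedged illustration, not the authors' code: the function names are hypothetical, energies are assumed to be given in units of $k_B T$, and the usage values ($R = 6$, $L = 400.2$, $r_c = R_a = 0.06$) are the parameters quoted in Sects. 3 and 5.

```python
import math


def sphere_volume(radius):
    """Geometric volume of a sphere."""
    return 4.0 * math.pi * radius ** 3 / 3.0


def exit_probability(binding_energy_kt):
    """Exit probability p = exp(-E / k_B T), with E given in units of k_B T."""
    return math.exp(-binding_energy_kt)


def effective_volume_cell(cell_radius, chain_length, chain_radius, p):
    """Effective volume of the cell: V + V_c (1/p - 1), with V_c = pi r_c^2 L."""
    v_chain = math.pi * chain_radius ** 2 * chain_length
    return sphere_volume(cell_radius) + v_chain * (1.0 / p - 1.0)


def effective_volume_target(target_radius, p):
    """Effective volume of the central target: V_a / p."""
    return sphere_volume(target_radius) / p


# Test cell of Sect. 5 at the exit probability p = 2^-8.
v_eff_free = effective_volume_cell(6.0, 400.2, 0.06, 1.0)
v_eff_bound = effective_volume_cell(6.0, 400.2, 0.06, 2.0 ** -8)
```

For $p = 1$ the chain term vanishes and the effective volume reduces to the geometric volume of the cell; for $p = 2^{-8}$ the thin chain already contributes more than the entire cell volume.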
2.4. Analytical estimate for the reaction time and
definition of the sliding length

In case of free diffusion and for a spherical cell, Szabo et al. [12] have evaluated the
exact solution for the time a walker needs to reach the radius $R_a$, after starting at the
periphery $R$, yielding

$$\tau_{\mathrm{Sz}}(R) = \frac{R^2}{3 D} \left( \frac{R}{R_a} + \frac{R_a^2}{2 R^2} - \frac{3}{2} \right) . \tag{10}$$

Here, $D$ is the diffusion coefficient. In the presence of the chain, exact solutions are
known for simple geometrical setups only, but as discussed elsewhere [8], it is still
possible to approximate the reaction time using an analytical approach, once certain
conditions are satisfied. The resulting expression is

$$\tau_{BA}(\xi) = \left( \frac{V}{8 D_{3d}\, \xi} + \frac{\pi L \xi}{4 D_{1d}} \right) \left[ 1 - \frac{2}{\pi} \arctan\left( \frac{R_a}{\xi} \right) \right] , \tag{11}$$

with the 'sliding' variable

$$\xi = \sqrt{\frac{D_{1d} K}{2 \pi D_{3d}}} \tag{12}$$

and $D_{1d}$ and $D_{3d}$ being the diffusion coefficients in sliding-mode and free
diffusion, respectively. Generally, the equilibrium constant $K$ has to be determined in
simulations of a (small) test system, containing a piece of chain without specific binding
site. In the present model, $K$ is known analytically via Eq. (7). If the step-size $dr$ of
the random walker is equal both inside and outside the chain (the direction of the step
being arbitrary), we further have

$$D_{1d} = D_{3d} = \frac{dr^2}{6} \tag{13}$$

and hence obtain

$$\xi = \sqrt{\frac{K}{2 \pi}} = r_c \sqrt{\frac{1}{2} \left( \frac{1}{p} - 1 \right)} . \tag{14}$$

This variable has the dimension of length; as we have pointed out in [8], it corresponds to
the average sliding length of the protein along the DNA contour in the model of Halford and
Marko [7], and we shall henceforth use the same expression for $\xi$. In case of free
diffusion ($p = 1$), the sliding length is zero and Eq. (11) simplifies to

$$\tau_{BA}(\xi = 0) = \frac{R^3}{3 R_a D_{3d}} , \tag{15}$$

which equals Szabo's result Eq. (10) in leading order of $R / R_a$.
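The sliding length can be cross-checked numerically against the values listed in the second column of Table I. The following sketch assumes the relation $\xi = \sqrt{D_{1d} K / (2 \pi D_{3d})}$ with $K = \pi r_c^2 (1/p - 1)$; the prefactor is inferred from the tabulated values, and all function and variable names are illustrative.

```python
import math


def equilibrium_constant(chain_radius, p):
    """K = sigma / c = pi r_c^2 (1/p - 1), i.e. V_c / L times (1/p - 1)."""
    return math.pi * chain_radius ** 2 * (1.0 / p - 1.0)


def sliding_length(chain_radius, p, d_1d=1.0, d_3d=1.0):
    """Sliding variable xi = sqrt(D_1d K / (2 pi D_3d)).

    With the default D_1d = D_3d this reduces to r_c * sqrt((1/p - 1) / 2).
    """
    k = equilibrium_constant(chain_radius, p)
    return math.sqrt(d_1d * k / (2.0 * math.pi * d_3d))


R_C = 0.06  # chain radius in persistence lengths (Sect. 3)

# Sliding parameters as printed in Table I for p = 2^-l, l = 1 ... 11.
TABLE_XI = {1: 0.042, 2: 0.073, 3: 0.112, 4: 0.164, 5: 0.236, 6: 0.337,
            7: 0.478, 8: 0.677, 9: 0.959, 10: 1.357, 11: 1.920}

computed_xi = {l: sliding_length(R_C, 2.0 ** -l) for l in TABLE_XI}
max_deviation = max(abs(computed_xi[l] - TABLE_XI[l]) for l in TABLE_XI)
```

The computed values agree with the tabulated ones to within their printed precision, and the sliding length vanishes for $p = 1$ as required.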



3. NUMERICAL MODEL 

In order to approximate the real biological situation, the DNA was modeled as a chain of
straight segments of equal length $l_0$. Its mechanical stiffness was introduced in terms
of a bending energy associated with each chain joint:

$$E_b = k_B T \, \alpha \, \theta^2 , \tag{16}$$

where $\alpha$ represents the dimensionless stiffness parameter, and $\theta$ the bending
angle. The numerical value of $\alpha$ defines the persistence length ($l_p$), i.e. the
"stiffness" of the chain. The excluded volume effect was taken into account by introducing
the effective chain radius $r_c$. Conformations of the chain with distances between
non-adjacent segments smaller than $2 r_c$ were forbidden. The target of specific binding
was assumed to lie exactly in the middle of the DNA. The whole chain was packed in a
spherical volume (cell) of radius $R$ in such a way that the target occupied the central
position.

To achieve a close packing of the chain inside the cell, we used the following algorithm.
First, a relaxed conformation of the free chain was produced by the standard Metropolis
Monte-Carlo (MC) method. For the further compression, we defined the center-norm (c-norm)
as the maximum distance from the target (the middle point) to the other parts of the chain.
Then, the MC procedure was continued with one modification: a MC step was rejected if the
c-norm exceeded 105% of the lowest value registered so far. The procedure was stopped when
the desired degree of compaction was obtained.

FIG. 1: Upper part: 2-dimensional projection of a 3-dimensional random chain-contour of
length $L = 400.2$ (persistence lengths) confined inside a spherical cell of radius
$R = 6$. Lower part: Radial chain density distribution, averaged over 20 conformations.
Beyond $r = 4$ (dashed line), the density declines rapidly.
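The compaction procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses pivot rotations of a chain end as the MC move, assumes a bending energy of $\alpha \theta^2$ per joint (in units of $k_B T$), starts from a straight chain instead of a relaxed one, and omits the excluded-volume test. All names are hypothetical.

```python
import numpy as np


def bending_energy(pos, alpha):
    """Sum of alpha * theta^2 over all joints, in units of k_B T (assumed form)."""
    seg = np.diff(pos, axis=0)
    seg = seg / np.linalg.norm(seg, axis=1, keepdims=True)
    cos_theta = np.clip(np.sum(seg[:-1] * seg[1:], axis=1), -1.0, 1.0)
    return alpha * np.sum(np.arccos(cos_theta) ** 2)


def c_norm(pos):
    """Maximum distance from the middle vertex (the target) to any vertex."""
    return np.max(np.linalg.norm(pos - pos[len(pos) // 2], axis=1))


def random_rotation(rng, max_angle):
    """Rotation matrix about a random axis (Rodrigues formula)."""
    axis = rng.normal(size=3)
    axis /= np.linalg.norm(axis)
    angle = rng.uniform(-max_angle, max_angle)
    k = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(angle) * k + (1.0 - np.cos(angle)) * (k @ k)


def compress_chain(pos, alpha, n_steps, rng, tolerance=1.05, max_angle=0.5):
    """Metropolis MC with the additional c-norm rejection rule."""
    pos = pos.copy()
    mid = len(pos) // 2
    energy = bending_energy(pos, alpha)
    best = c_norm(pos)                      # lowest c-norm registered so far
    for _ in range(n_steps):
        j = rng.integers(1, len(pos) - 1)   # pivot joint
        rot = random_rotation(rng, max_angle)
        trial = pos.copy()
        if j >= mid:                        # rotate the end beyond the pivot,
            trial[j + 1:] = pos[j] + (pos[j + 1:] - pos[j]) @ rot.T
        else:                               # so the middle vertex never moves
            trial[:j] = pos[j] + (pos[:j] - pos[j]) @ rot.T
        e_new = bending_energy(trial, alpha)
        if rng.random() >= np.exp(min(0.0, energy - e_new)):
            continue                        # standard Metropolis rejection
        cn = c_norm(trial)
        if cn > tolerance * best:
            continue                        # c-norm exceeds 105% of the minimum
        pos, energy, best = trial, e_new, min(best, cn)
    return pos, best


# Straight chain of 40 segments of length 0.2 (8 persistence lengths).
rng_demo = np.random.default_rng(1)
chain0 = np.zeros((41, 3))
chain0[:, 0] = 0.2 * np.arange(41)
chain1, best_cnorm = compress_chain(chain0, alpha=2.403, n_steps=2000, rng=rng_demo)
```

Because every move is a rigid rotation of one chain end about a joint, the segment lengths are preserved exactly, and the c-norm can never drift more than 5% above its running minimum.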

Throughout this paper, one step $dt$ was chosen as the unit of time and one persistence
length $l_p = 50$ nm of the DNA chain as the unit of distance. The following parameter
values were used. The length of one segment was chosen as $l_0 = 0.2$, so that one
persistence length was partitioned into 5 segments. The corresponding value of the
stiffness parameter was $\alpha = 2.403$ [13]. The chain radius was $r_c = 0.06$, and the
active site was modeled as a sphere of identical radius $R_a = 0.06$ embedded into the
chain. The step-size of the random walker both inside and outside the chain was
$dr = 0.02$, corresponding to a diffusion coefficient
$D_{3d} = D_{1d} = dr^2 / 6 = 2 \cdot 10^{-4} / 3$.

Figure 1 displays a typical chain, and the radial chain density, obtained with Monte Carlo
integration and averaged over 20 different chain conformations. The strong increase of
chain density towards the center is merely a geometric effect, caused by the chain passing
through the origin. Close to the periphery of the cell, the density was rapidly declining
since the contour was forced to bend back inwards. Within a radius of $r < 4$, however, the
chain content remained reasonably constant, and the medium could be regarded as
approximately homogeneous.



4. COMPUTATION OF THE SPECIFIC 
RECURRENCE TIME 

To compute the specific recurrence time $\tilde{\tau}_R$ of Eq. (3), the recurrence time
inside a small test system (here: the central binding target of radius $R_a$) has to be
determined. To achieve that, the entire system, i.e. the spherical target and a short piece
of chain, was embedded into a cube of side-length $4 R_a$ with reflective walls. In
principle, the size of the cube should be of no relevance, but it was found that, if chosen
too small, effects of the finite step-size were emerging. The walker started inside the
sphere. Each time upon leaving the spherical volume, a collision was noted. If the walker
was about to exit the cylindrical volume of the chain, it was reflected back inside with
the probability $1 - p$ (Eq. (6)). The clock was halted as long as the walker moved outside
the sphere and only counted time-steps inside the sphere. The resulting recurrence time
$\tau_R^{*}$ has to be divided by the effective volume of the central target, Eq. (9), to
yield the specific recurrence time $\tilde{\tau}_R$. Table I contains the results for a set
of different walker-chain affinities.
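For the special case of free diffusion ($p = 1$, no chain), the measurement described above can be sketched in a few lines. The code below is an illustration under assumptions, not the original implementation: steps have a fixed length $dr$ and an isotropic direction, the reflective walls are treated by folding an unfolded trajectory back into the cube (specular reflection), and a time step is counted whenever the walker sits inside the sphere. The function name and seed are hypothetical.

```python
import numpy as np


def recurrence_time_free(n_steps, r_a=0.06, dr=0.02, seed=0):
    """Recurrence time inside a sphere of radius r_a for a free random walker
    confined to a cube of side 4 * r_a with reflective walls."""
    rng = np.random.default_rng(seed)
    side = 4.0 * r_a
    # Isotropic steps of fixed length dr.
    steps = rng.normal(size=(n_steps, 3))
    steps *= dr / np.linalg.norm(steps, axis=1, keepdims=True)
    # Unfolded trajectory, starting at the center of the cube (inside the sphere).
    path = side / 2.0 + np.cumsum(steps, axis=0)
    # Reflective walls: fold the unfolded coordinates back into [0, side].
    folded = np.mod(path, 2.0 * side)
    folded = np.where(folded > side, 2.0 * side - folded, folded)
    inside = np.linalg.norm(folded - side / 2.0, axis=1) < r_a
    steps_inside = np.count_nonzero(inside)               # clock runs only inside
    exits = np.count_nonzero(inside[:-1] & ~inside[1:])   # one collision per exit
    return steps_inside / exits


tau_test = recurrence_time_free(400_000)
volume_target = 4.0 * np.pi * 0.06 ** 3 / 3.0
tau_specific_free = tau_test / volume_target   # specific recurrence time for p = 1
```

With $R_a = 0.06$ and $dr = 0.02$, a geometric argument (volume of the sphere divided by the volume swept out of it by one step) predicts a recurrence time of about 4.04 steps, so the specific recurrence time should come out close to the value 4464 listed for free diffusion in Table I, up to statistical noise.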




FIG. 2: First passage times (left) and number of collisions (right) as a function of the
reaction coordinate $r$ (in persistence lengths), for various exit probabilities
$p = 2^{-l}$ with $l = 3, 5, 7, 9, 11$ (bottom to top plots). The curves are
$\chi^2$-fits of Eq. (17) (left) and Eq. (18) (right) within the range $\xi < r < 4$ and
extrapolated to the full range $0 < r < 6$.



5. DIFFUSION INSIDE THE CELL 

The goal is to analyze the propagation of the walker within a small cell of radius $R_S$
and to extrapolate the results to a larger system of arbitrary size $R_L > R_S$. As a test
site we have set up a cell of radius $R = 6$, containing a chain of length $L = 400.2$
(Figure 1). The walker started at the center ($r = 0$) and moved towards the periphery of
the cell. Such a process shall be denoted as a run. Whenever the walker returned to the
binding site ($r < R_a$), one collision was noted. A set of 2000 runs, including 20
different chain conformations, was carried out for each value of the exit parameter $p$,
which is related to the walker-chain affinity via Eq. (6). For a set of reaction
coordinates $r_i$, the first arrival times were monitored, as well as the number of
collisions that had occurred before first passage.

5.1. The effective diffusion coefficient

Figure 2 displays the first arrival times (left) for different exit probabilities $p$. To
analyse the diffusive properties of the propagation, the arrival times were fitted using
the macroscopic diffusion law

$$\tau_f(r) = \frac{r^{\alpha}}{6 D_{\mathrm{eff}}} , \tag{17}$$

with an effective diffusion coefficient $D_{\mathrm{eff}}(p)$. For low and moderate values
of the walker-chain affinity, the arrival times were well described when assuming regular
diffusion, i.e. an exponent of $\alpha = 2$. At high walker-chain affinities, this exponent
was growing larger, indicating the onset of anomalous subdiffusion. Table I contains the
fit parameters when the fits were carried out within the range $\xi < r < 4$, and the solid
curves in Figure 2 (left) display the resulting functional form of Eq. (17) when
extrapolated to the full range $0 < r < 6$.

The lower boundary of the fit range, the sliding length $\xi$, was implemented because near
the central target the transport process was dominated by one-dimensional sliding rather
than three-dimensional diffusion. The upper boundary was introduced since the chain
distribution beyond $r > 4$ was affected by boundary effects near the periphery of the
cell, as is clearly visible in Figure 1. Within the range of $\xi < r < 4$, however, the
propagation of the walker could approximately be regarded as a random walk inside a
homogeneous and crowded environment.



5.2. The functional dependence of $N_{\mathrm{coll}}$ on the
target-distance



The right hand side of Figure 2 displays the number of collisions $N_{\mathrm{coll}}$ as a
function of the radius $r$ for various walker-chain affinities. Quite generally, there
exists a steep increase close to the central target, after which the function gradually
levels off to reach a plateau. In Appendix A we argue that this functional behavior can be
described as

$$N_{\mathrm{coll}}(r) = N_{\infty} \cdot \frac{r - R_{\mathrm{eff}}}{r} , \tag{18}$$

where $N_{\infty}$ stands for the asymptotic limit $N_{\mathrm{coll}}(r \to \infty)$ and
$R_{\mathrm{eff}}$ defines an effective target size. As a result of facilitated diffusion,
the mode of propagation is predominantly one-dimensional near the central target. This
relation is therefore invalid within a radius of the average sliding length of the walker
and should be applied for $r > \xi$. Under this condition, both $N_{\infty}$ and
$R_{\mathrm{eff}}$ were used as free fit-parameters and the fit range was restricted to
$\xi < r < 4$, for the same reason as discussed in Sect. 5.1. The solid curves of Figure 2
(right) display the best fits (extrapolated to $r = 6$), and Table I contains the
corresponding values for the fit-parameters.
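Both fits of Sects. 5.1 and 5.2 can be sketched with ordinary linear least squares, assuming the functional forms $\tau_f(r) = r^{\alpha} / (6 D_{\mathrm{eff}})$ and $N_{\mathrm{coll}}(r) = N_{\infty} (r - R_{\mathrm{eff}}) / r$. Note that this linearised fit is a substitute chosen for brevity; the original work used $\chi^2$-fits, and the function names and the synthetic data below are assumptions made for illustration only.

```python
import numpy as np


def fit_first_passage(r, tau_f):
    """Fit tau_f = r^alpha / (6 D_eff) via a straight line in log-log space.

    log(tau_f) = alpha * log(r) - log(6 D_eff)
    """
    slope, intercept = np.polyfit(np.log(r), np.log(tau_f), 1)
    alpha = slope
    d_eff = np.exp(-intercept) / 6.0
    return d_eff, alpha


def fit_collisions(r, n_coll):
    """Fit N_coll = N_inf (1 - R_eff / r) via a straight line in 1/r.

    N_coll = N_inf - (N_inf * R_eff) * (1 / r)
    """
    slope, intercept = np.polyfit(1.0 / r, n_coll, 1)
    n_inf = intercept
    r_eff = -slope / n_inf
    return n_inf, r_eff


# Synthetic, noise-free data inside a fit range similar to xi < r < 4.
r_demo = np.linspace(0.5, 4.0, 15)
tau_demo = r_demo ** 2.2 / (6.0 * 2.44e-5)
ncoll_demo = 25.0 * (r_demo - 0.118) / r_demo

d_fit, alpha_fit = fit_first_passage(r_demo, tau_demo)
n_inf_fit, r_eff_fit = fit_collisions(r_demo, ncoll_demo)
```

On noise-free input both fits return the generating parameters; with simulation data, the points would additionally have to be restricted to the range $\xi < r < 4$ before fitting.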

An alternative approach to $N_{\mathrm{coll}}(r)$ is described in Appendix B, leading to

$$N_{\mathrm{coll}}(r) = \frac{f(r)}{\tilde{\tau}_R \, V_{\mathrm{eff}}(r)} - 1 , \tag{19}$$

where $f(r)$ is defined in Eq. (B2). It contains both parameters $D_{\mathrm{eff}}$ and
$R_{\mathrm{eff}}$, which are used as free fit parameters to determine the effective
diffusion coefficient and an effective target size. The results are given in Table I. The
effective volume $V_{\mathrm{eff}}(r)$ as a function of radius $r$ is actually a
complicated function that depends on the radial chain density (Fig. 1), but for this
investigation we have assumed a perfectly homogeneous chain density and evaluated

$$V_{\mathrm{eff}}(r) = \frac{V(r) \, V_{\mathrm{eff}}(R)}{V(R)} \tag{20}$$



with the cell radius $R = 6$. When comparing the best fits for the effective diffusion
coefficient with the results of Eq. (17), the agreement is only qualitative. In fact,
Eq. (20) does not deliver an accurate way to determine $D_{\mathrm{eff}}$. This may be so
because the second term of the function

$$f(r) = \frac{r^2}{3 D_{\mathrm{eff}}} \left( \frac{r}{R_{\mathrm{eff}}} + \frac{R_{\mathrm{eff}}^2}{2 r^2} - 1 \right) ,$$

the fraction $R_{\mathrm{eff}}^2 / r^2$, quickly drops to zero and hence both
fit-parameters $D_{\mathrm{eff}}$ and $R_{\mathrm{eff}}$ become linearly dependent. This
implies that $D_{\mathrm{eff}}$ is actually determined locally, close to the (effective)
target, and not averaged over $\xi < r < 4$. Except for high walker-chain affinities,
Eq. (17) delivers a more accurate description of the diffusion process, which is verified
by the quadratic dependence of the passage time on the reaction coordinate.

The effective target size $R_{\mathrm{eff}}$ agrees fairly well with the corresponding
findings of Eq. (18) and increases substantially with the walker-chain affinity. As a
consequence of facilitated diffusion, the walker initially moves away from the target in
one-dimensional sliding mode, and its (effectively) free diffusion begins further outside,
thereby increasing the effective target size. Hence it is no surprise to find
$R_{\mathrm{eff}}$ being of similar dimension as the average sliding length $\xi$
(Table I).



TABLE I: The first column is the exponent $l$ of the exit probability $p = 2^{-l}$, the
second column the corresponding sliding parameter $\xi$, followed by the specific
recurrence time $\tilde{\tau}_R$ (Sect. 4). The next six columns contain optimized
parameters of the $\chi^2$-fits of equations (17), (18) and (19). The last column defines
the speed up achieved with the extrapolation from $R_S = 4$ to $R_L = 6$, when compared
with the explicit simulation of the reaction time $\tau_{BA}(R_L)$.
 l     xi       tau~_R
 0     0        4464
 1     0.042    2594
 2     0.073    1413
 3     0.112    741.6
 4     0.164    379.7
 5     0.236    192.6
 6     0.337    96.81
 7     0.478    48.62
 8     0.677    24.30
 9     0.959    12.17
10     1.357    6.089
11     1.920    3.044

Eq. (17): D_eff (units of 10^-5), alpha
[...]
6.66    2
[...]
6.35    2
[...]
3.67    2
2.83    2
[...]
2.44    2.20
2.41    2.27
[...]

Eq. (18): R_eff, N_inf
0.064   3.83
0.078   6.42
0.084   9.95
0.100   15.7
0.118   25.0
0.174   39.7
0.224   61.9
0.319   95.3
0.417   135
0.59    199
0.81    279
1.10    398

Eq. (19): D_eff, R_eff
6.09    0.060
5.98    0.069
6.42    0.079
6.73    0.092
6.42    0.116
5.27    0.167
4.28    0.231
2.95    0.348
2.09    0.491
[...]
5.3. The empirical MEC-E equations

It is now possible to combine Equations (18) and (17) with Eqs. (1) and (2) to obtain the
empirical MEC-E equations

$$\tau_{BA}(p, r) = \left( N_{\mathrm{coll}}(p, r) + 1 \right) \tilde{\tau}_R \, V_{\mathrm{eff}}(r) - \tau_f(p, r) , \tag{21}$$

which allow one to evaluate the reaction time $\tau_{BA}(p, r)$ for any reaction coordinate
$r$ by extrapolation of the number of collisions $N_{\mathrm{coll}}(p, r)$ and the first
arrival times $\tau_f(p, r)$. When using Eq. (19) instead of Eq. (18), we obtain

$$\tau_{\mathrm{Sz,eff}}(p, r) = \frac{r^2}{3 D_{\mathrm{eff}}(p)} \left[ \frac{r}{R_{\mathrm{eff}}(p)} + \frac{R_{\mathrm{eff}}^2(p)}{2 r^2} - \frac{3}{2} \right] , \tag{22}$$

which can be regarded as an empirical generalization of Szabo's exact result for free
diffusion, Eq. (10).

Since both sets of equations are based on the MEC approach, while employing different ways
to extrapolate the number of collisions to large cells, we will refer to them as MEC-E
equations. In the following section we will apply both approaches, Eq. (21) and Eq. (22),
to extrapolate the reaction times to large cell radii, and compare their results.
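The extrapolation step can be sketched as follows, assuming the forms $N_{\mathrm{coll}}(r) = N_{\infty} (r - R_{\mathrm{eff}}) / r$ and $\tau_f(r) = r^{\alpha} / (6 D_{\mathrm{eff}})$, a constant chain density (so that the chain length scales with the cell volume), and the standard mean first-passage time between an absorbing inner and a reflecting outer sphere as the exact free-diffusion reference. The function names are hypothetical; the numerical inputs are the free-diffusion entries of Table I ($\tilde{\tau}_R = 4464$, $N_{\infty} = 3.83$, $R_{\mathrm{eff}} = 0.064$), taken to be in units of time steps and persistence lengths.

```python
import math


def effective_volume(radius, chain_length, chain_radius, p):
    """V + V_c (1/p - 1) with V_c = pi r_c^2 L."""
    v_cell = 4.0 * math.pi * radius ** 3 / 3.0
    return v_cell + math.pi * chain_radius ** 2 * chain_length * (1.0 / p - 1.0)


def mec_e_time(r_large, p, tau_specific, n_inf, r_eff, d_eff, alpha=2.0,
               r_test=6.0, chain_length_test=400.2, chain_radius=0.06):
    """Extrapolated reaction time for a cell of radius r_large."""
    # Constant chain density: the chain length grows with the cell volume.
    chain_length = chain_length_test * (r_large / r_test) ** 3
    v_eff = effective_volume(r_large, chain_length, chain_radius, p)
    n_coll = n_inf * (r_large - r_eff) / r_large          # collisions
    tau_f = r_large ** alpha / (6.0 * d_eff)              # first arrival time
    return (n_coll + 1.0) * tau_specific * v_eff - tau_f


def szabo_time(radius, target_radius, d):
    """Mean first-passage time from a reflecting sphere of given radius to an
    absorbing target sphere (free diffusion)."""
    return radius ** 2 / (3.0 * d) * (radius / target_radius
                                      + target_radius ** 2 / (2.0 * radius ** 2)
                                      - 1.5)


D_FREE = 0.02 ** 2 / 6.0   # dr^2 / 6 with dr = 0.02

tau_extrapolated = mec_e_time(20.0, 1.0, tau_specific=4464.0, n_inf=3.83,
                              r_eff=0.064, d_eff=D_FREE)
tau_exact = szabo_time(20.0, 0.06, D_FREE)
ratio = tau_extrapolated / tau_exact
```

For $R_L = 20$ and free diffusion, this sketch places the extrapolated time roughly 8% above the exact reference, which is the order of deviation reported in Sect. 6.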
FIG. 3: Reaction time $\tau_{BA}$ of the protein as a function of the sliding length
Eq. (14). The explicit simulation (solid dots) required about 140 times the number of
simulation steps of the extrapolation using Eq. (21) (open circles) or Eq. (22)
(triangles). The curve is the analytical estimate Eq. (11).



6. RESULTS 

As a first consistency check, the MEC-E equations were applied to estimate the reaction
time $\tau_{BA}$ of the walker entering the cell at radius $R = 6$. The simulation of the
reaction $B \to A$ was additionally carried out explicitly and the results are displayed in
Figure 3. The effective volume of the cell was evaluated via Eq. (8), using the total chain
length $L = 400.2$. The results imply that the extrapolation from $R_S = 4$ (the radius
used to optimize the parameters) to $R_L = 6$ delivered accurate results for the reaction
times. This should not be taken for granted, taking into account the problematic chain
distribution between $R_S < r < R_L$. In fact, $\tau_f(p, r)$ becomes inaccurate in this
region due to anomalous diffusion (Figure 2, left), but this term contributes just a small
amount to Eq. (21), since for reasonably large cells the first arrival time $\tau_f$ is
small compared to the corresponding reaction time $\tau_{BA}$. Its error was therefore of
little impact. On the other hand, the collisions $N_{\mathrm{coll}}(r)$ with the central
target, which form the main contribution to Eq. (21), were much less affected by the chain
distribution far outside the center (Figure 2, right) and were extrapolated accurately,
despite the sparse chain density at the cell periphery. This feature contributes to the
fact that the extrapolation process appears to be insensitive to the chain distribution far
away from the target. Similarly, Eq. (22) delivered consistent and accurate results, except
for the last data point, which belongs to the highest walker-chain affinity. Here, the
sliding length already reaches one half of the system size that was used to fit the
empirical parameters. A larger test system is required for such high affinities to increase
the accuracy of the extrapolation procedure.

FIG. 4: Extrapolation of the reaction time $\tau_{BA}$ to large cell radii $R_L$ (cell size
in persistence lengths). The dotted curve is the analytical estimate Eq. (11), the solid
and dashed curves correspond to the MEC-E equations (21) and (22), respectively. Upper
triple: $p = 1$ (free diffusion). Lower triple: $p = 2^{-8}$, where facilitated diffusion
is most effective.

The simulation time required to set up the MEC-E equations (21) and (22) equals the average
number of time steps the walker needed to reach the radius $R_S = 4$ when starting at the
central target, i.e. $\tau_f(p, R_S)$. Compared to the corresponding time required to
simulate $\tau_{BA}(R_L)$ explicitly, a speed up between 50 and 500 was gained, depending
upon walker-chain affinity (Table I, last column). Integrated over all 12 data points, a
total speed up of 140 was derived.

It is possible and intended to exploit this method for extrapolations to much larger
systems. Figure 4 displays the extrapolation of $\tau_{BA}(p, R_L)$ up to $R_L = 20$ for
$p = 1$ (free diffusion) and $p = 2^{-8}$, close to the minimum in Figure 3. The chain
density was assumed to remain constant, i.e. its length was growing as
$L(R_L) \sim R_L^3$. Explicit simulations of $\tau_{BA}$ are no longer feasible for such
large cells. However, for free diffusion, Eq. (10) is available, and both extrapolation
methods delivered reaction times about 8% above the exact solution, which, in this plot,
was indistinguishable from the approximation Eq. (15). When protein-chain interaction was
enabled, both extrapolation methods delivered almost identical results, which were about
15% above the analytical estimate Eq. (11).
7. SUMMARY

In this work, the empirical MEC-E equations (21) and (22) were derived and tested against
random walk simulations. Whereas the original MEC approach (Sect. 2.2) represents an exact
method to obtain the average reaction time $\tau_{BA}$ by simulating the much faster
back-reaction $A \to B$, it still requires setting up a model system of full size $R$. This
would become prohibitive in simulations of large cells containing realistic chains with
thermal fluctuations and hydrodynamic interactions.

We have demonstrated that the simulation of a test system of moderate size is sufficient to
extract reaction times of much larger cells. This is so because the number of collisions as
a function of the cell radius, $N_{\mathrm{coll}}(r)$, asymptotically approaches a plateau
(Figure 2, right). In this region, the reaction time is merely proportional to the
effective volume $V_{\mathrm{eff}}$, as shown in Eq. (21), with a small correction in form
of the first passage time $\tau_f(r)$, Eq. (17). This quantity is easily estimated once the
effective diffusion coefficient is determined. If the test system is too small for
$N_{\mathrm{coll}}(r)$ to reach the plateau, it is still possible to obtain accurate
results, because the functional form of this quantity is known (Eqs. (18) and (19)), so
that extrapolations to larger cells become feasible.

The size of the test system has to be chosen with care, because only those regions are of
use in which the walker experiences a randomized and approximately homogeneous environment.
Within the central region, typically of the size of the sliding length $\xi$, the reaction
time is dominated by 1-dimensional (sliding) instead of 3-dimensional diffusion. This part
of the cell has to be excluded when the walker's diffusion properties are analyzed. The
same holds true for the outermost region, where the chain conformation exhibits boundary
effects. Assuming that the sliding length $\xi$ does not exceed the persistence length
$l_p$, a cell radius $R$ of five persistence lengths appears adequate. Here, the region
$\xi < r < R - 2 l_p$ may be exploited to set up the empirical equations (21) or (22). With
increasing walker-chain affinity and sliding length, the radius $R$ has to be adjusted
accordingly.

The results presented above demonstrate how the MEC-E approach delivers a speed up between
50 and 500 (depending on walker-chain affinity, Table I) by extrapolation from $R_S = 4$ to
$R_L = 6$, with respect to explicit simulations of the reaction time $\tau_{BA}$. With
increasing radius $R_L$, Eq. (22) is approximated as

$$\tau_{\mathrm{Sz,eff}}(R_L \gg R_{\mathrm{eff}}) \approx \frac{R_L^3}{3 D_{\mathrm{eff}} R_{\mathrm{eff}}} , \tag{23}$$

and the speed up is therefore approximately growing proportional to $R_L^3$.



APPENDIX A: PROOF OF EQUATION (18)

As was shown by Berg [14], the probability of a walker, after starting at
$r_{\mathrm{ini}}$ (where $R_a < r_{\mathrm{ini}} < R$), to be adsorbed at $R_a$ before
reaching the distance $R$, is

$$P(R) = \frac{R_a (R - r_{\mathrm{ini}})}{r_{\mathrm{ini}} (R - R_a)} . \tag{A1}$$

This was derived from the steady-state solution of Fick's second equation for spherical
symmetry,

$$\frac{1}{r^2} \frac{d}{dr} \left( r^2 \frac{dC}{dr} \right) = 0 . \tag{A2}$$

Here, $C(r)$ is the concentration, having a maximum at the particle source radius
$r = r_{\mathrm{ini}}$ and dropping to zero at the adsorbers' radii $r = R_a$ and $r = R$.

In our case, not the probability $P(r)$, but the average number $N_{\mathrm{coll}}(r)$ of
events in which the walker returns to $r = R_a$ before first reaching the distance $r = R$
is of interest. We shall now assume that $N_{\mathrm{coll}}(r)$ is known for one particular
distance $r$, and we want to derive $N_{\mathrm{coll}}(r + dr)$. The probability that the
walker, starting from $r$, goes straight to $r + dr$ is $1 - dP(r)$. Then, the probability
to first return to the target, before passing through $r$ and reaching $r + dr$, is
$dP \, (1 - dP)$. In this latter case, $2 \cdot N_{\mathrm{coll}}(r) + 1$ collisions have
already occurred on average. The probability to return exactly $n$ times to the target and
back to $r$ before reaching $r + dr$ is $dP^n (1 - dP)$, yielding
$(n + 1) \cdot N_{\mathrm{coll}}(r) + n$ collisions. The sum

$$N_{\mathrm{coll}}(r + dr) = (1 - dP) \sum_{n=1}^{\infty} \left[ n N_{\mathrm{coll}}(r) + n - 1 \right] dP^{\,n-1} = \frac{N_{\mathrm{coll}}(r) + 1}{1 - dP} - 1 \tag{A3}$$

leads to the differential equation

$$\frac{dN(r)}{dr} = N \cdot \frac{dP}{dr} + \frac{dP}{dr} . \tag{A4}$$

With Eq. (A1) we further have

$$dP = \frac{R_a \, dr}{r (r - R_a)} , \tag{A5}$$

so that Eq. (A4) is solved as

$$N_{\mathrm{coll}}(r) = (N_{\infty} + 1) \frac{r - R_a}{r} - 1 . \tag{A6}$$

Here, $N_{\infty} = N_{\mathrm{coll}}(r \to \infty)$ is the asymptotic limit for the number
of collisions far away from the target. This solution is incorrect close to the target,
where $N_{\mathrm{coll}}(R_a) = -1$. In fact, the validity of this approach is restricted
to length scales that are large compared to the (finite) step-size. In particular, since we
want to extrapolate $N_{\mathrm{coll}}(r)$ to a large distance, we can assume $r$ to be
large enough so that $N_{\mathrm{coll}}(r) \gg 1$. Then, the sum Eq. (A3) simplifies to

$$N_{\mathrm{coll}}(r + dr) = (1 - dP) \sum_{n=1}^{\infty} n N_{\mathrm{coll}}(r) \, dP^{\,n-1} = \frac{N_{\mathrm{coll}}(r)}{1 - dP} , \tag{A7}$$

leading to

$$\frac{dN(r)}{dr} = N \cdot \frac{dP}{dr} , \tag{A8}$$

which finally solves to

$$N_{\mathrm{coll}}(r) = N_{\infty} \frac{r - R_a}{r} . \tag{A9}$$

Both parameters $N_{\infty}$ and $R_a$ were used as free fit parameters. We have verified
that Eq. (A6) and Eq. (A9) deliver identical results when extrapolating to large radii, so
that, for the sake of simplicity, Eq. (A9) was applied throughout this work.
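The two closed-form solutions of this appendix can be checked numerically. The sketch below integrates the differential equation $dN/dr = (N + 1) \, R_a / [r (r - R_a)]$ with a classical Runge-Kutta scheme and compares the result with $(N_{\infty} + 1)(r - R_a)/r - 1$; it also evaluates the difference to the simplified form $N_{\infty} (r - R_a)/r$, which equals $R_a / r$ and therefore vanishes at large radii. The function names and the parameter values are illustrative assumptions.

```python
def n_coll_full(r, n_inf, r_a):
    """Solution including the '+1' term: (N_inf + 1) (r - R_a) / r - 1."""
    return (n_inf + 1.0) * (r - r_a) / r - 1.0


def n_coll_simplified(r, n_inf, r_a):
    """Simplified solution valid for N_coll >> 1: N_inf (r - R_a) / r."""
    return n_inf * (r - r_a) / r


def integrate_collisions(n_start, r_start, r_end, r_a, n_steps=20000):
    """Runge-Kutta (RK4) integration of dN/dr = (N + 1) R_a / (r (r - R_a))."""
    def rhs(r, n):
        return (n + 1.0) * r_a / (r * (r - r_a))

    h = (r_end - r_start) / n_steps
    n, r = n_start, r_start
    for _ in range(n_steps):
        k1 = rhs(r, n)
        k2 = rhs(r + 0.5 * h, n + 0.5 * h * k1)
        k3 = rhs(r + 0.5 * h, n + 0.5 * h * k2)
        k4 = rhs(r + h, n + h * k3)
        n += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        r += h
    return n


# Start on the analytic curve at r = 1 and integrate outwards to r = 6.
N_INF, R_A = 10.0, 0.1
n_numeric = integrate_collisions(n_coll_full(1.0, N_INF, R_A), 1.0, 6.0, R_A)
n_analytic = n_coll_full(6.0, N_INF, R_A)

# Difference between the two closed forms at a large radius (equals R_a / r).
gap = n_coll_simplified(20.0, 398.0, 1.1) - n_coll_full(20.0, 398.0, 1.1)
```

The gap of $R_a / r$ between the two forms is negligible compared to $N_{\mathrm{coll}}$ itself once the number of collisions is large, which is consistent with the statement that both expressions extrapolate identically.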



APPENDIX B: PROOF OF EQUATION (19)

When considering Eq. (1),

$$\tau_{BA} + \tau_{AB} = (N_{\mathrm{coll}} + 1) \cdot \tau_R ,$$

we note that in case of free diffusion the reaction time $\tau_{BA}$ is given by Eq. (10)
and $\tau_{AB}$ by Eq. (17) with the free diffusion coefficient $D$, Eq. (13), so that

$$(N_{\mathrm{coll}}(r) + 1) \cdot \tau_R(r) = f(r) \tag{B1}$$

and

$$f(r) = \frac{r^2}{3 D} \left( \frac{r}{R_a} + \frac{R_a^2}{2 r^2} - 1 \right) \tag{B2}$$

with $r > R_a$. Using Eq. (3) we obtain

$$N_{\mathrm{coll}}(r) = \frac{f(r)}{\tilde{\tau}_R \, V_{\mathrm{eff}}(r)} - 1 . \tag{B3}$$

Both quantities $D$ and the effective source radius $R_a$ are used as free fit parameters.



[1] A.D. Riggs, S. Bourgeois and M. Cohn, The lac repressor- 
operator interaction. 3. Kinetic studies, J. Mol. Biol. 53, 
401 (1970). 

[2] P.H. Richter and M. Eigen, Diffusion controlled reaction 
rates in spheroidal geometry. Application to repressor- 
operator association and membrane bound enzymes, Bio- 
phys. Chem., 2, 255 (1974). 

[3] O.G. Berg and P.H. von Hippel, Diffusion-controlled
macromolecular reactions, Annu. Rev. Biophys. Chem.
14, 130 (1985).

[4] M. Ptashne and A. Gann, Genes and Signals. Cold 
Spring Harbor Laboratory Press, Cold Spring Harbor, 
NY. (2001). 

[5] O.G. Berg, R.B. Winter and P.H. von Hippel, Diffu- 
sion driven mechanisms of protein translocation on nu- 
cleic acids. 1. Models and theory, Biochemistry 20, 6929 
(1981). 

[6] H.X. Zhou and A. Szabo, Enhancement of Association 
Rates by Nonspecific Binding to DNA and Cell Mem- 
branes, Phys. Rev. Lett. 93, 178101 (2004). 

[7] S.E. Halford and J.F. Marko, How do site-specific DNA- 
binding proteins find their targets?, Nucleic Acids Re- 
search 32, 3040 (2004). 

[8] K. Klenin, H. Merlitz, J. Langowski and C.X. Wu, Facilitated
diffusion of DNA-binding proteins, Phys. Rev.
Lett. 96, 018104 (2006).

[9] Tao Hu, A.Yu. Grosberg, B.I. Shklovskii, How do proteins
search for their specific sites on coiled or globular
DNA, arXiv:q-bio.BM/0510043 (2005).

[10] H. Merlitz, K. Klenin, C.X. Wu and J. Langowski, Facil- 
itated diffusion of DNA-binding proteins: Efficient sim- 
ulation with the method of excess collisions (MEC), J. 
Chem. Phys. 124 (2006) (in print). 

[11] K.V. Klenin and J. Langowski, Modeling of intramolec- 
ular reactions of polymers: An efficient method based on 
Brownian dynamics simulations, J. Chem. Phys. 121, 
4951 (2004). 

[12] A. Szabo, K. Schulten and Z. Schulten, First passage time 
approach to diffusion controlled reactions, J. Chem. Phys. 
72, 4350 (1980). 

[13] K. Klenin, H. Merlitz and J. Langowski, A Brownian
Dynamics Program for the Simulation of Linear and Circular
DNA and other Wormlike Chain Polyelectrolytes,
Biophys. J. 74, 780 (1998).

[14] Howard C. Berg, Random walks in Biology, Princeton 
University Press, expanded edition (1993). 



